Abstract

We analyze the cost used by a naive exhaustive search algorithm for finding a maximum
independent set in random graphs under the usual $\mathscr{G}_{n,p}$-model where each possible
edge appears independently with the same probability $p$. The expected cost turns out to be
of the less common asymptotic order $n^{c\log n}$, which we explore from several different
perspectives. We also collect many instances where such an order appears, from algorithmics
to analysis, from probability to algebra. The limiting distribution of the cost required by
the algorithm under a purely idealized random model is proved to be normal. The approach
we develop is of some generality and is amenable to other graph algorithms.

MSC 2000 Subject Classifications: Primary 05C80, 05C85; secondary 65Q30. 

Key words: random graphs, maximum independent set, depoissonization, exhaustive search
algorithm, recurrence relations, method of moments, Laplace transform.

1 Introduction 

An independent set or stable set of a graph $G$ is a subset of vertices in $G$ no two of which
are adjacent. The Maximum Independent Set (MIS) Problem consists in finding an independent
set with the largest cardinality; it is among the first known NP-hard problems and has become
a fundamental, representative, prototype instance of combinatorial optimization and
computational complexity; see Garey and Johnson (1979). A large number of algorithms (exact or
approximate, deterministic or randomized), as well as many applications, have been studied in
the literature; see Bomze et al. (1999); Fomin and Kratsch (2010); Woeginger (2003) and the
references therein for more information.

*Part of the work of this author was done while visiting ISM (Institute of Statistical Mathematics), Tokyo; he
thanks ISM for its hospitality and support.

The fact that there exist several essentially equivalent problems (including maximum
clique and minimum node cover) adds further dimensions to the algorithmic
aspects and structural richness of the problem. Also worthy of special mention is the following
interesting polynomial formulation (see Abello et al. (2001); Harant (2000))

$$\alpha(G) = \max_{0\le x_1,\dots,x_n\le 1}\Bigl(\sum_{1\le i\le n} x_i - \sum_{(i,j)\in E} x_i x_j\Bigr),$$

where $\alpha(G)$ denotes the cardinality of an MIS of $G$ (or the stability number) and $E$ is the set
of edges of $G$. Such an expression is easily coded, albeit with an exponential complexity. The
algorithmic, theoretical and practical connections of many other formulations similar to this
one have also been widely discussed; see Abello et al. (2001).

One simple means to find an MIS of a graph G is the following exhaustive (or branching or 
enumerative) algorithm. Start with any node, say v in G. Then either v is in an MIS or it is not. 
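This leads to the recursive decomposition

$$\mathrm{MIS}(G) = \underbrace{\mathrm{MIS}(G\setminus\{v\})}_{v\notin\mathrm{MIS}(G)} \quad\text{or}\quad \underbrace{\{v\}\cup\mathrm{MIS}(G\setminus N^*(v))}_{v\in\mathrm{MIS}(G)}, \tag{1.1}$$

whichever is larger,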



where MIS(G) denotes an MIS of G and N*(v) denotes the union of v and all its neighbors. 
Such a simple procedure leads to many refined algorithms in the literature, including alterna- 
tive formulations such as backtracking (see Wilf (2002)) or branch and bound (see Fomin and 
Kratsch (2010)). 
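
The branching step just described can be sketched in a few lines of Python. This is only a minimal illustration under our own conventions, not the refined algorithms cited above: the graph is assumed to be given as a dictionary `adj` of neighbour sets, the pivot is taken to be the smallest label (any node would do), and the names `mis_size` and `counter` are hypothetical.

```python
def mis_size(adj, vertices=None, counter=None):
    """Size of a maximum independent set, by branching on one vertex.

    adj      -- dict mapping each vertex to the set of its neighbours
    vertices -- vertices still present (defaults to all of adj)
    counter  -- optional one-element list; counter[0] counts the calls made
    """
    if vertices is None:
        vertices = frozenset(adj)
    if counter is not None:
        counter[0] += 1
    if not vertices:
        return 0
    v = min(vertices)  # "any node"; the smallest label keeps runs reproducible
    # Case v not in the MIS: remove v only.
    without_v = mis_size(adj, vertices - {v}, counter)
    # Case v in the MIS: remove N*(v), i.e. v together with all its neighbours.
    with_v = 1 + mis_size(adj, vertices - ({v} | adj[v]), counter)
    return max(without_v, with_v)
```

For instance, a cycle on five nodes has stability number 2, and a graph with no edges forces the full binary branching at every step.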

Tarjan and Trojanowski (1977) proposed an improved exhaustive
algorithm with worst-case time complexity $O(2^{n/3})$. Their paper has since been followed and
refined by many authors; see Bomze et al. (1999); Woeginger (2003) and Fomin and Kratsch (2010)
for more information and references. In particular, Chvátal (1977) generalized Tarjan
and Trojanowski's algorithm and showed inter alia that for almost all graphs with $n$ nodes, a
special class of algorithms (which he called order-driven) has time bound $O(n^{c_0\log n+2})$, where
$c_0 := 2/\log 2$. He also characterized exponential algorithms and conjectured that a similar
bound of the form $O(n^{c\log n})$ holds for a wider class of recursive algorithms for some $c > 0$.
Pittel (1982) then refined Chvátal's bounds by showing that, under the usual $\mathscr{G}_{n,p}$-model
(namely, each pair of nodes has the same probability $p \in (0, 1)$ of being connected by an
edge, independently of the others), the cost of Chvátal's algorithms (called $f$-driven,
more general than order-driven) is bounded between $n^{(\frac14-\varepsilon)\log_\kappa n}$ and $n^{(\frac12+\varepsilon)\log_\kappa n}$ with high
probability, for any $\varepsilon > 0$, where $q := 1 - p$ and $\kappa := 1/q$.

The infrequent scale $n^{c\log n} = e^{c(\log n)^2}$ is central to our study here and can be seen through
several different angles that will be examined in the following paragraphs. The simplest
algorithmic connection to the MIS problem is via the following argument. It is well known that for
any random graph $G$ (under the $\mathscr{G}_{n,p}$-model), the value of $\alpha(G)$ is highly concentrated for fixed
$p \in (0, 1)$, namely, there exists a sequence $m_n$ such that $\alpha(G) = m_n$ or $\alpha(G) = m_n + 1$ with
high probability; see Bollobás (2001). Asymptotically ($\kappa := 1/q$),

$$m_n = 2\log_\kappa n - 2\log_\kappa\log_\kappa n + O(1).$$

For more information on this and related estimates, see Bollobás (2001) and the references
therein. Thus a simple randomized (approximate) MIS-finding algorithm consists in examining
all possible

$$\binom{n}{m_n} + \binom{n}{m_n+1} = O\bigl(n^{2\log_\kappa n}\bigr)$$

subsets and determining if at least one of them is independent; otherwise (which happens with
very small probability; see Bollobás (2001)), we resort to exhaustive algorithms such as that
discussed in this paper.

From a different algorithmic viewpoint, Jerrum (1992) studied the following Metropolis
algorithm for maximum clique. Sequentially increase the clique, say $K$, as follows: (i) choose a
vertex $v$ uniformly at random; (ii) if $v \notin K$ and $v$ is connected to every vertex of $K$, then add
$v$ to $K$; (iii) if $v \in K$, then $v$ is removed from $K$ with probability $\lambda^{-1}$. He proved that for
all $\lambda \ge 1$, there exists an initial state from which the expected time for the Metropolis
process to reach a clique of size at least $(1 + \varepsilon)\log_\kappa(pn)$ exceeds $n^{\Omega(\log pn)}$. See Coja-Oghlan and
Efthymiou (2011) for an account of more recent developments on the complexity of the MIS
problem.

We aim in this paper at a more precise analysis of the cost used by the simple recursive,
exhaustive algorithm implied by (1.1). The exact details of the algorithm matter less and the
overall cost is dominated by the total number of recursive calls, denoted by $X_n$, which is a
random variable under the same $\mathscr{G}_{n,p}$-model. Then the mean value $\mu_n := \mathbb{E}(X_n)$ satisfies

$$\mu_n = \underbrace{\mu_{n-1}}_{v\notin\mathrm{MIS}(G)} + \underbrace{\sum_{0\le k<n}\pi_{n,k}\,\mu_k}_{v\in\mathrm{MIS}(G)} \tag{1.2}$$

for $n \ge 2$, with the initial conditions $\mu_0 = 0$ and $\mu_1 = 1$, where

$$\pi_{n,k} := \mathbb{P}(v \text{ has } n-1-k \text{ neighbors}) = \binom{n-1}{k}p^{n-1-k}q^{k}.$$
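
The binomial recurrence for the mean can be evaluated exactly for small $n$. The following sketch assumes the recurrence $\mu_n = \mu_{n-1} + \sum_{0\le k<n}\binom{n-1}{k}p^{n-1-k}q^k\mu_k$ with $\mu_0 = 0$ and $\mu_1 = 1$, and uses rational arithmetic; the function name `expected_cost` is ours and purely illustrative.

```python
from fractions import Fraction
from math import comb


def expected_cost(n_max, p):
    """Return [mu_0, ..., mu_{n_max}] computed exactly from the recurrence."""
    p = Fraction(p)
    q = 1 - p
    mu = [Fraction(0), Fraction(1)]  # initial conditions mu_0 = 0, mu_1 = 1
    for n in range(2, n_max + 1):
        # pi_{n,k} = C(n-1, k) p^(n-1-k) q^k: v has n-1-k neighbours,
        # so k nodes remain after removing v and its neighbours.
        tail = sum(comb(n - 1, k) * p ** (n - 1 - k) * q ** k * mu[k]
                   for k in range(n))
        mu.append(mu[n - 1] + tail)
    return mu[: n_max + 1]
```

With $p = 1/2$ one gets $\mu_2 = 1 + q = 3/2$ and $\mu_3 = 19/8$, which can be checked by hand.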

How fast does $\mu_n$ grow as a function of $n$? (i) If $p$ is close to 1, then the graph is very dense
and thus the sum in (1.2) is small (many nodes being removed), so we expect a polynomial
time bound by simple iteration; (ii) if $p$ is sufficiently small, then the second term is large, and
we expect an exponential time bound; (iii) what happens for $p$ in between? In this case the
asymptotics of $\mu_n$ turns out to be nontrivial and we will show that

$$\log\mu_n = \frac{\bigl(\log\frac{n}{\log_\kappa n}\bigr)^2}{2\log\kappa} + \Bigl(\frac12+\frac1{\log\kappa}\Bigr)\log n - \log\log n + P_0\Bigl(\log_\kappa\frac{n}{\log_\kappa n}\Bigr) + o(1), \tag{1.3}$$

where $P_0(t)$ is a bounded, periodic function of period 1. We will give a precise expression for
$P_0$. Note that



$$\frac{\mu_n}{n^{\frac12\log_\kappa n}} \asymp \frac{n^{-\log_\kappa\log_\kappa n+\frac1{\log\kappa}+\frac12}\,(\log_\kappa n)^{\frac12\log_\kappa\log_\kappa n}}{\log n} \ll n^{-K} \to 0, \tag{1.4}$$

for any $K > 0$, where the symbol $a_n \asymp b_n$ means that $a_n$ and $b_n$ are asymptotically of the same
order. Thus $\mu_n = o\bigl(n^{\frac12\log_\kappa n-K}\bigr)$. On the other hand, the asymptotic pattern (1.3) is to some
extent generic, as we will see below.

An intuitive way to see why we have the asymptotic form (1.3) for $\log\mu_n$ is to look at the
simpler functional equation

$$\nu(x) = \nu(x-1) + \nu(qx), \tag{1.5}$$

since the binomial distribution is highly concentrated around its mean value $pn$, and we expect
that $\mu_n \approx \nu(n)$ (under suitable initial conditions). This functional equation and the like (such
as $\nu_n = \nu_{n-1} + \nu_{\lfloor qn\rfloor}$) have a rich literature. Most of them are connected to special integer
partitions; important pointers are provided in the Encyclopedia of Integer Sequences; see for example
A000123, A002577, A005704, A005705, and A005706. In particular, it is connected to
partitions of integers into powers of $\kappa = 1/q \ge 2$ when $\kappa$ is a positive integer; see de Bruijn (1948);
Fredman and Knuth (1974); Mahler (1940). It is known that (under suitable initial conditions)

$$\log\nu(x) = \frac{\bigl(\log\frac{x}{\log_\kappa x}\bigr)^2}{2\log\kappa} + \Bigl(\frac12+\frac1{\log\kappa}\Bigr)\log x - \log\log x + P_1\Bigl(\log_\kappa\frac{x}{\log_\kappa x}\Bigr) + o(1), \tag{1.6}$$

for large $x$, where $P_1(t)$ is a bounded 1-periodic function; see de Bruijn (1948); Dumas and
Flajolet (1996). Thus



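$$|\log\mu_n - \log\nu(n)| = \Bigl|P_0\Bigl(\log_\kappa\frac{n}{\log_\kappa n}\Bigr) - P_1\Bigl(\log_\kappa\frac{n}{\log_\kappa n}\Bigr)\Bigr| + o(1).$$

We see that approximating the binomial distribution in (1.2) by its mean value

$$\mathbb{E}\bigl(\mu_{n-1-\mathrm{Binom}(n-1;p)}\bigr) \approx \mu_{n-1-\mathbb{E}(\mathrm{Binom}(n-1;p))} \approx \mu_{\lfloor qn\rfloor}$$

gives a very precise estimate, where $\mathrm{Binom}(n-1; p)$ denotes a binomial distribution with
parameters $n - 1$ and $p$.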
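
The discrete analogue $\nu_n = \nu_{n-1} + \nu_{\lfloor qn\rfloor}$ of the simplified functional equation is easy to tabulate. The sketch below assumes the initial value $\nu_0 = 1$, which is our choice (the text only says "suitable initial conditions"); the name `nu_sequence` is hypothetical. With $q = 1/2$ and this initial value the recurrence is the one satisfied by the binary partition numbers A000123.

```python
from fractions import Fraction
from math import floor


def nu_sequence(n_max, q, nu0=1):
    """Return [nu_0, ..., nu_{n_max}] for nu_n = nu_{n-1} + nu_{floor(q n)}."""
    q = Fraction(q)  # exact arithmetic so that floor(q * n) is reliable
    nu = [nu0]
    for n in range(1, n_max + 1):
        # For 0 < q < 1 the index floor(q n) is at most n - 1, hence known.
        nu.append(nu[n - 1] + nu[floor(q * n)])
    return nu
```

The first ten values for $q = 1/2$ are 1, 2, 4, 6, 10, 14, 20, 26, 36, 46.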

An even simpler way to see the dominant order $x^{c\log x}$ is to approximate (1.5) by the delay
differential equation (since $\nu(x) - \nu(x-1) \approx \nu'(x)$ for large $x$)

$$\omega'(x) = \omega(qx), \tag{1.7}$$

which is a special case of the so-called "pantograph equations"

$$\omega'(x) = a\,\omega(qx) + b\,\omega(x),$$

originally arising from the study of current collection systems for electric locomotives; see
Iserles (1993); Kato and McLeod (1971); Ockendon and Tayler (1971). Since the usual polynomial
or exponential functions fail to satisfy (1.7), we try instead a solution of the form $\omega(x) = x^{c\log x}$;
then $c$ should be chosen to satisfy the equation

$$x^{1-2c\log\kappa} = 2c\,e^{-c(\log\kappa)^2}\log x.$$

So we should take $c = 1/(2\log\kappa) + O\bigl((\log x)^{-1}\log\log x\bigr)$. This gives the dominant term
$(\log x)^2/(2\log\kappa)$ for $\log\omega(x)$. More precise asymptotic solutions are thoroughly discussed in
de Bruijn (1953); Kato and McLeod (1971). In particular, all solutions of the equation
$\omega'(x) = a\,\omega(qx)$ with $a > 0$ satisfy

$$\log\omega(x) = \frac{\bigl(\log\frac{x}{\log_\kappa x}\bigr)^2}{2\log\kappa} + \Bigl(\frac12+\frac1{\log\kappa}+\frac{\log a}{\log\kappa}\Bigr)\log x - \Bigl(1+\frac{\log a}{\log\kappa}\Bigr)\log\log x + P_2\Bigl(\log_\kappa\frac{x}{\log_\kappa x}\Bigr) + o(1),$$

for large $x$, where $P_2(t)$ is a bounded 1-periodic function. We see once again the generality of
the asymptotic pattern (1.3).

On the other hand, the function

$$w(x) := \exp\Bigl(\frac{(\log(x/\sqrt{q}))^2}{2\log(1/q)}\Bigr)$$

satisfies the $q$-difference equation

$$w(x) = x\,w(qx),$$

and is a fundamental factor in the asymptotic theory of $q$-difference equations; see the two
survey papers Adams (1931); Di Vizio et al. (2003) and the references therein. This equation
will also play an important role in our analysis.
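
The $q$-difference equation for $w$ is a one-line identity (the two squared logarithms differ by exactly $\log x$), and it can also be confirmed numerically. A small sketch, with `w` named after the function above and floating-point arithmetic assumed adequate for moderate $x$:

```python
import math


def w(x, q):
    """w(x) = exp( (log(x / sqrt(q)))^2 / (2 log(1/q)) ), for x > 0, 0 < q < 1."""
    return math.exp(math.log(x / math.sqrt(q)) ** 2 / (2.0 * math.log(1.0 / q)))


def defect(x, q):
    """Relative defect of the relation w(x) = x * w(q x); zero up to rounding."""
    return abs(w(x, q) - x * w(q * x, q)) / w(x, q)
```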

From yet another angle, one easily checks that the series

$$M(x) := \sum_{j\ge0}\frac{q^{j(j-1)/2}}{j!}\,x^j$$

satisfies the equation (1.7). The largest term occurs, by simple calculus, at

$$j \approx \log_\kappa x - \log_\kappa\log_\kappa x + \tfrac12 + o(1),$$

and, by the analytic approach we use in this paper, we can deduce that the logarithm of the series
is, up to an error of $O(1)$, of the same asymptotic order as $\log\nu(x)$; see (1.6) and Section 6. The
function $M(x)$ arises sporadically in many different contexts and plays an important role in the
corresponding asymptotic estimates; see below for a list of some representative references.
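
Both claims about the series can be checked numerically. The sketch below assumes the series has general term $q^{j(j-1)/2}x^j/j!$, truncates it at 200 terms (ample for moderate $x$), and compares the termwise derivative with the value at $qx$; the helper names `m_terms`, `M` and `M_prime` are ours.

```python
import math


def m_terms(x, q, j_max=200):
    """Terms q^(j(j-1)/2) x^j / j! for j = 0..j_max, built from the term ratio."""
    terms = [1.0]
    for j in range(j_max):
        # Ratio of consecutive terms: t_{j+1} / t_j = q^j x / (j + 1).
        terms.append(terms[-1] * (q ** j) * x / (j + 1))
    return terms


def M(x, q):
    return sum(m_terms(x, q))


def M_prime(x, q):
    """Termwise derivative: sum over j >= 1 of j * t_j / x."""
    return sum(j * t / x for j, t in enumerate(m_terms(x, q)) if j >= 1)


def largest_term_index(x, q):
    terms = m_terms(x, q)
    return max(range(len(terms)), key=terms.__getitem__)
```

For $x = 1000$ and $q = 1/2$ the largest term sits at $j = 7$, while the approximation $\log_\kappa x - \log_\kappa\log_\kappa x + \frac12$ gives about 7.15.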

A closely related sum arises in the average-case analysis of a simple backtracking algorithm
(see Wilf (2002)), which corresponds to the expected number of independent sets in a random
graph (or, equivalently, the expected number of cliques by interchanging $q$ and $p$)

$$J_n := \sum_{1\le j\le n}\binom{n}{j}q^{j(j-1)/2}; \tag{1.8}$$

see Matula (1970); Wilf (2002). Wilf (2002) showed that $J_n = O(n^{\log n})$ when $p = 1/2$.
While such a crude bound is easily obtained, the more precise asymptotics of $J_n$ is more
involved. First, it is straightforward to check that $J_n \sim M(n)$ for large $n$. Second, the approach
we develop in this paper can be used to show that $J_n$ has an asymptotic expansion similar to
(1.3). Indeed, it is readily checked that $J_n + 1$ satisfies the same recurrence relation as $\mu_n$ with
different initial conditions. So the asymptotics of $J_n$ follows the same pattern (1.3) as that of
$\mu_n$; see Section 6 for more details.

Figure 1: The connection between MIS-finding algorithms and the scale $n^{c\log n}$ (discrete) or
$x^{c\log x}$ (continuous). The circles on the right-hand side are more algorithmic in nature, while
those on the left-hand side are more analytic in nature.

Thus examining all independent sets one after another in the backtracking style of
Wilf (2002) and identifying the one with the maximum cardinality also leads to an expected
$n^{c\log n}$-complexity.
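
The remark that $J_n + 1$ obeys the same binomial recurrence as the mean cost can be verified exactly for small $n$. In the sketch below we take $J_n = \sum_{1\le j\le n}\binom nj q^{j(j-1)/2}$, so that $J_n + 1$ simply includes the empty set, and we test the relation $a_n = a_{n-1} + \sum_{0\le k<n}\binom{n-1}{k}p^{n-1-k}q^k a_k$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def a(n, q):
    """a_n = J_n + 1 = sum_{0 <= j <= n} C(n, j) q^(j(j-1)/2)."""
    q = Fraction(q)
    return sum(comb(n, j) * q ** (j * (j - 1) // 2) for j in range(n + 1))


def recurrence_rhs(n, q):
    """Right-hand side a_{n-1} + sum_k C(n-1,k) p^(n-1-k) q^k a_k, for n >= 1."""
    q = Fraction(q)
    p = 1 - q
    return a(n - 1, q) + sum(
        comb(n - 1, k) * p ** (n - 1 - k) * q ** k * a(k, q) for k in range(n)
    )
```

Only the initial value differs from that of the mean cost: here $a_0 = 1$.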

The diverse aspects we discussed of algorithms or equations leading to the scale $n^{c\log n}$ are
summarized in Figure 1. The bridge connecting the algorithms and the analysis is the binomial
recurrence (1.2) as explained above.

This paper is organized as follows. We derive in the next section an asymptotic expansion
for $\mu_n$ using a purely analytic approach. The interest of deriving such a precise asymptotic
approximation is at least fourfold.

Asymptotics: It goes much beyond the crude description $n^{c\log n}$ and provides a more
precise description; see particularly (1.4) and its implication mentioned there. Indeed,
few papers in the literature address such an aspect; see de Bruijn (1948, 1953); Dumas
and Flajolet (1996); Kato and McLeod (1971); Pennington (1953); Richmond (1976).

Numerics: All scales involved in problems of similar nature here are expressed either in
log or in log log, making them harder to identify by numerical simulations. The
inherent periodic functions and the slow convergence further add to the complications.

Methodology: Our approach, different from previous ones that rely on explicit generating
functions in product forms, is based on the underlying functional equation and is of some
generality; it is akin to some extent to Mahler's analysis in Mahler (1940).

Generality: The asymptotic pattern (1.3) is of some generality, an aspect already examined
in detail in several papers; see for example de Bruijn (1953); Dumas and Flajolet
(1996); Kato and McLeod (1971). See also the last section for a list of diverse contexts
where the order $n^{c\log n}$ appears.

Alternative approaches leading to different asymptotic expansions are discussed in Sec- 
tion 3. 

The next curiosity after the expected value is the variance. But due to the strong dependence
of the subproblems, the variance is quite challenging at this stage. We consider instead an
idealized independent version of $X_n$ (the total cost of the exhaustive algorithm implied by
(1.1)), namely

$$Y_n \stackrel{d}{=} Y_{n-1} + Y^*_{n-1-\mathrm{Binom}(n-1;p)} \qquad (n \ge 2), \tag{1.9}$$

with $Y_1 := 1$ and $Y_0 := 0$, where "$\stackrel{d}{=}$" stands for equality in distribution, $Y^*_n$ is an
identical copy of $Y_n$ and the two terms on the right-hand side are independent. The original
random variable $X_n$ satisfies the same distributional recurrence but with the two terms ($X_{n-1}$ and
$X^*_{n-1-\mathrm{Binom}(n-1;p)}$) on the right-hand side dependent. We expect that $Y_n$ would provide some
insight into the possible stochastic behavior of $X_n$, although we were unable to evaluate their
difference. We show, by a method of moments, that $Y_n$ is asymptotically normally distributed in
addition to deriving an asymptotic estimate for the variance. Monte Carlo simulations for $n$ up
to a few hundred show that the limiting distribution of $X_n$ seems likely to be normal, although
the ratio between its variance and that of $Y_n$ grows like a concave function. But the sample size
$n$ is not large enough to provide more convincing conclusions from simulations.
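
A direct sampler for the idealized variable $Y_n$ follows from its distributional recurrence. This is only an illustrative sketch and not the simulation code behind the experiments mentioned above: it assumes independent sub-costs drawn with fresh randomness, samples the binomial by summing Bernoulli trials, and uses the hypothetical name `sample_Y`. Its running time is proportional to the sampled value, so it is meant for small $n$ only.

```python
import random


def sample_Y(n, p, rng):
    """Draw one value of Y_n = Y_{n-1} + Y*_{n-1-Binom(n-1; p)}, Y_0 = 0, Y_1 = 1."""
    if n <= 1:
        return n
    # Number of neighbours of the pivot: Binom(n - 1; p) as a sum of Bernoulli trials.
    neighbours = sum(rng.random() < p for _ in range(n - 1))
    # The two summands use fresh randomness, i.e. they are independent.
    return sample_Y(n - 1, p, rng) + sample_Y(n - 1 - neighbours, p, rng)
```

The boundary values $p = 1$ and $p = 0$ (excluded in the paper) are convenient sanity checks: they give the deterministic values $1$ and $2^{n-1}$.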

Once the asymptotic normality of $Y_n$ is clarified, a natural question then is the limit law of
the random variables (obtained by changing the underlying binomial to a uniform distribution)

$$Z_n \stackrel{d}{=} Z_{n-1} + Z_{\mathrm{Uniform}(0,n-1)} \qquad (n \ge 2), \tag{1.10}$$

with $Z_0 = 0$ and $Z_1 = 1$. In this case, we prove that the mean is asymptotic to $c\,n^{-1/4}e^{2\sqrt{n}}$ and
the limit law is no longer normal. We conclude this paper with a few remarks and a list of many
instances where $n^{c\log n}$ arises, further clarifications and connections being given elsewhere.

Notations. Throughout this paper, $0 < p < 1$, $q = 1 - p$, and $\kappa = 1/q$.



2 Expected cost 

We derive asymptotic approximations to $\mu_n$ in this section by an analytic approach, which is
briefly sketched in Figure 2.



2.1 Preliminaries and main result 

Recall that $X_n$ denotes the cost used by the exhaustive search algorithm (implied by (1.1)) for
finding an MIS in a random graph, and it satisfies the recurrence

$$X_n \stackrel{d}{=} X_{n-1} + X^*_{n-1-\mathrm{Binom}(n-1;p)}, \tag{2.1}$$

with $X_0 = 0$ and $X_1 = 1$, where $X^*_n \stackrel{d}{=} X_n$, and the two terms on the right-hand side are
dependent.

From (2.1), we see that the expected value $\mu_n$ of $X_n$ satisfies the recurrence (1.2). Our
analytic approach then proceeds along the lines depicted in Figure 2. While the approach appears
standard (see Flajolet and Sedgewick (2009); Jacquet and Szpankowski (1998); Szpankowski
(2001)), the major difference is that instead of the Mellin transform, we need the Laplace transform
since the quantity in question is not polynomially bounded. Also the diverse functional
equations are crucial in our analysis, notably for the purpose of justifying the de-Poissonization,
which differs from previous ones; see Jacquet and Szpankowski (1998); Szpankowski (2001).



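[Figure 2 is a flow diagram with the following boxes: Recurrence relation $\mu_n = \mu_{n-1} + \sum_{0\le k<n}\pi_{n,k}\mu_k$;
Poisson generating function $\tilde f'(z) = \tilde f(qz) + e^{-z}$; Modified Laplace transform
$f^*(s) = s f^*(qs) + \frac{s}{1+s}$; Inverse transform $\tilde f(x) = \frac1{2\pi i}\int\frac{e^{xs}}{s}f^*(\frac1s)\,ds$;
de-Poissonization $\mu_n = \frac{n!}{2\pi i}\oint z^{-n-1}e^z\tilde f(z)\,dz$; Poisson-Charlier expansion
$\mu_n \sim \tilde f(n) - \frac n2\tilde f''(n)$.]

Figure 2: Our analytic approach to the asymptotics of $\mu_n$. Here $\pi_{n,k} := \binom{n-1}{k}q^kp^{n-1-k}$.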



Generating functions (GFs). Let $f(z) := \sum_{n\ge0}\mu_nz^n/n!$ denote the exponential GF of $\mu_n$.
Then $f$ satisfies, by (1.2), the equation

$$f'(z) = 1 + f(z) + e^{pz}f(qz),$$

with $f(0) = 0$, or, equivalently, denoting by $\tilde f(z) := e^{-z}f(z)$ the Poisson GF of $\mu_n$,

$$\tilde f'(z) = \tilde f(qz) + e^{-z}, \tag{2.2}$$

with $\tilde f(0) = 0$.

Closed-form expressions. Let $\tilde f(z) = \sum_{n\ge0}\tilde\mu_nz^n/n!$. From the $q$-differential equation
(2.2), we derive the recurrence

$$\tilde\mu_{n+1} = q^n\tilde\mu_n + (-1)^n.$$

By iteration, we then obtain the closed-form expression

$$\tilde\mu_n = \sum_{0\le j<n}(-1)^jq^{(n-1-j)(n+j)/2} \qquad (n \ge 1).$$

Since $f(z) = e^z\tilde f(z)$, we then have

$$\mu_n = \sum_{1\le k\le n}\binom nk\sum_{0\le j<k}(-1)^jq^{(k-1-j)(k+j)/2} \qquad (n \ge 1). \tag{2.3}$$

This expression, although exact, is less useful for large $n$; also its asymptotic behavior remains
opaque. See also (3.4) for another closed-form expression for $\mu_n$.
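
The closed-form expression can be cross-checked against the defining recurrence with exact rational arithmetic. The sketch below assumes the double sum $\mu_n = \sum_{1\le k\le n}\binom nk\sum_{0\le j<k}(-1)^jq^{(k-1-j)(k+j)/2}$ and the recurrence with $\mu_0 = 0$, $\mu_1 = 1$; the function names are ours.

```python
from fractions import Fraction
from math import comb


def mu_by_recurrence(n_max, p):
    """[mu_0, ..., mu_{n_max}] from mu_n = mu_{n-1} + sum_k pi_{n,k} mu_k."""
    p = Fraction(p)
    q = 1 - p
    mu = [Fraction(0), Fraction(1)]
    for n in range(2, n_max + 1):
        mu.append(mu[n - 1] + sum(
            comb(n - 1, k) * p ** (n - 1 - k) * q ** k * mu[k] for k in range(n)))
    return mu[: n_max + 1]


def mu_tilde(n, q):
    """Poisson-GF coefficient: sum_{0 <= j < n} (-1)^j q^((n-1-j)(n+j)/2)."""
    # (n-1-j) + (n+j) is odd, so the product is even and // 2 is exact.
    return sum((-1) ** j * q ** ((n - 1 - j) * (n + j) // 2) for j in range(n))


def mu_closed_form(n, p):
    """mu_n as the binomial transform of the coefficients mu_tilde."""
    q = 1 - Fraction(p)
    return sum(comb(n, k) * mu_tilde(k, q) for k in range(1, n + 1))
```

The alternating signs make the closed form numerically delicate in floating point, which is one reason to prefer exact fractions here.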

Asymptotic approximations. Our aim in this section is to derive the following asymptotic 
approximation. 

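Theorem 2.1. The expected cost $\mu_n$ of the exhaustive search on a random graph satisfies

$$\mu_n = \frac{G\bigl(\log_\kappa\frac{n}{\log_\kappa n}\bigr)}{\sqrt{2\pi}}\,\frac{n^{1/\log\kappa+1/2}}{\log_\kappa n}\exp\Bigl(\frac{\bigl(\log\frac{n}{\log_\kappa n}\bigr)^2}{2\log\kappa}\Bigr)\Bigl(1+O\Bigl(\frac{(\log\log n)^2}{\log n}\Bigr)\Bigr), \tag{2.4}$$

as $n \to \infty$, where $G(u)$ is defined by ($\{u\}$ being the fractional part of $u$)

$$G(u) := q^{(\{u\}^2-\{u\})/2}\sum_{-\infty<j<\infty}\frac{q^{j(j+1)/2-j\{u\}}}{1+q^{j-\{u\}}}$$

(see (2.8)) and is a bounded, 1-periodic function of $u$.

Note that (2.4) implies (1.3) with

$$P_0(u) = -\tfrac12\log 2\pi + \log\log\kappa + \log G(u).$$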

Our approach leads indeed to an asymptotic expansion, but we content ourselves with the state- 
ment of (2.4); see (2.18), (2.23) and (3.3). 

The function $f$ (and thus $\tilde f$) is an entire function. It follows immediately that we have the
identity (see Hwang et al. (2010))

$$\mu_n = \sum_{j\ge0}\frac{\tilde f^{(j)}(n)}{j!}\,\tau_j(n)$$

(referred to as the Poisson-Charlier expansion in Hwang et al. (2010)) where the $\tau_j(n)$'s are
polynomials of $n$ of degree $\lfloor j/2\rfloor$; see (2.24). See also Jacquet and Szpankowski (1998) for
different representations. However, the hard part is often to justify the asymptotic nature of the
expansion, namely, that the truncated sum

$$\sum_{0\le j<J}\frac{\tilde f^{(j)}(n)}{j!}\,\tau_j(n)$$

approximates $\mu_n$ up to an error of smaller order, for $J = 2, 3, \dots$. In particular, the first-order
asymptotic equivalent "$\mu_n \sim \tilde f(n)$" is often called the Poisson heuristic. Thus the asymptotics
of $\mu_n$ is reduced to that of $\tilde f(x)$ once we justify the asymptotic nature of the expansion. Of
special mention is that, unlike almost all papers in the literature, we need only the asymptotic
behavior of $\tilde f(x)$ for real values of $x$, all analysis involving complex parameters being carefully
handled by the corresponding functional equation.

We will derive an asymptotic expansion for $\tilde f(x)$ for large real $x$ by Laplace transform
techniques and suitable manipulation of the saddle-point method, and then bridge the asymptotics
of $\mu_n$ and $\tilde f(n)$ by a variant of the saddle-point method (or de-Poissonization procedure; see
Jacquet and Szpankowski (1998)); see Figure 2 for a sketch of our proof.

2.2 Asymptotics of f(x) 

We derive an asymptotic expansion for $\tilde f(x)$ in this subsection.

Modified Laplace transform. For technical convenience, consider the modified Laplace
transform

$$f^*(s) := \frac1s\int_0^\infty e^{-x/s}\tilde f(x)\,dx.$$

Note that this use of the Laplace transform differs from the usual one by a factor $1/s$ and
by a change of variables $s \mapsto 1/s$. Also the use of the exponential GF coupled with this
Laplace transform is equivalent to considering the ordinary GF of $\mu_n$; see Section 3.2 for more
information.

Then the functional-differential equation (2.2) translates into the following functional
equation for $f^*$

$$f^*(s) = s f^*(qs) + \frac{s}{1+s}, \tag{2.5}$$

for $\Re(s) > 0$.

Iterating the equation (2.5) indefinitely, we get

$$f^*(s) = \sum_{j\ge0}\frac{q^{j(j+1)/2}s^{j+1}}{1+q^js}. \tag{2.6}$$

We will approximate $f^*(s)$ for large $s$ by means of the function

$$F(s) := \sum_{-\infty<j<\infty}\frac{q^{j(j+1)/2}s^{j+1}}{1+q^js},$$

because adding terms of the form $s^{-j}$, $j \ge 0$, does not alter the asymptotic order of both
functions.

Lemma 2.2. For $x > 1$, we have

$$F(x) = x^{1/2}\exp\Bigl(\frac{(\log x)^2}{2\log\kappa}\Bigr)G(\log_\kappa x), \tag{2.7}$$

where

$$G(u) := q^{(\{u\}^2+\{u\})/2}F\bigl(q^{-\{u\}}\bigr) \tag{2.8}$$

is a continuous, positive, periodic function with period 1.

Proof. One can easily check that $F(s)$ satisfies a functional equation similar to that of Jacobi's
theta functions

$$F(s) = sF(qs) \qquad (s \in \mathbb{C}). \tag{2.9}$$

Iterating this functional equation $N$ times, we obtain

$$F(s) = q^{N(N-1)/2}s^NF(q^Ns) \qquad (s \in \mathbb{C}). \tag{2.10}$$

Assume $x > 1$. Take

$$N = \lfloor\log_\kappa x\rfloor = \log_\kappa x + \eta,$$

where $\eta = -\{\log_\kappa x\}$. Then we have

$$F(x) = \exp\Bigl(-\frac{N(N-1)}{2}\log\kappa + N\log x\Bigr)F(q^{\eta}) = \exp\Bigl(\frac{(\log x)^2}{2\log\kappa} + \frac{\log x}{2} - \frac{\eta(\eta-1)}{2}\log\kappa\Bigr)F(q^{\eta}),$$

which, together with the functional equation $F(1/q) = F(1)/q$ (or $G(u+1) = G(u)$), proves
the lemma. □
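
The functional equation and the factorization in the lemma can be confirmed numerically. The sketch below assumes the bilateral series $F(s) = \sum_{-\infty<j<\infty}q^{j(j+1)/2}s^{j+1}/(1+q^js)$, truncated to $|j| \le 60$ (the discarded terms are far below machine precision for the moderate arguments used here), and $G(u) = q^{(\{u\}^2+\{u\})/2}F(q^{-\{u\}})$; the truncation level is our choice.

```python
import math


def F(s, q, j_range=60):
    """Truncated bilateral series for F(s), s > 0 and 0 < q < 1."""
    total = 0.0
    for j in range(-j_range, j_range + 1):
        # log of q^(j(j+1)/2) s^(j+1); exponentiating last avoids overflow.
        log_num = (j * (j + 1) / 2) * math.log(q) + (j + 1) * math.log(s)
        total += math.exp(log_num) / (1.0 + q ** j * s)
    return total


def G(u, q):
    """Periodic factor G(u) = q^(({u}^2 + {u})/2) F(q^(-{u}))."""
    frac = u - math.floor(u)
    return q ** ((frac * frac + frac) / 2) * F(q ** (-frac), q)


def lemma_rhs(x, q):
    """Right-hand side x^(1/2) exp((log x)^2 / (2 log kappa)) G(log_kappa x)."""
    log_kappa = math.log(1.0 / q)
    u = math.log(x) / log_kappa
    return math.sqrt(x) * math.exp(math.log(x) ** 2 / (2 * log_kappa)) * G(u, q)
```

Comparing `F(x, q)` with `x * F(q * x, q)` and with `lemma_rhs(x, q)` for a few values of $x$ shows agreement to rounding error.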

Asymptotic expansion of $\tilde f(x)$: saddle-point method. By the inversion formula, we have

$$\tilde f(x) = \frac1{2\pi i}\int_{r-i\infty}^{r+i\infty}\frac{e^{xs}}{s}f^*\Bigl(\frac1s\Bigr)ds, \tag{2.11}$$

where $r > 0$ is a small number whose value will be specified later. We now derive a few
estimates for $f^*(s)$.

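Lemma 2.3. (i) If $r > 0$ and $|t| \ge 1$, then

$$f^*\Bigl(\frac1{r+it}\Bigr) = O\Bigl(\frac1{|t|}\Bigr); \tag{2.12}$$

(ii) if $0 < r \le 1$ and $|t| \le 1$, then

$$f^*\Bigl(\frac1{r+it}\Bigr) = F\Bigl(\frac1{r+it}\Bigr) + O(1); \tag{2.13}$$

(iii) if $r > 0$ and $c_mr \le |t| \le 1$, where $c_m := \sqrt{q^{-2m}-1}$, $m \ge 1$, then

$$f^*\Bigl(\frac1{r+it}\Bigr) = O\bigl(r^mq^{-m(m-1)/2}F(1/r)\bigr). \tag{2.14}$$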

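Proof. First, (2.12) follows from (2.6). For the estimate (2.13), we observe that

$$\Bigl|\frac1{1+sq^j}\Bigr| \le \min\{q^{-j}|s|^{-1}, 1\} \qquad (\Re(s) > 0).$$

Then

$$f^*(s) = F(s) + O(|s|^{-1})$$

for $\Re(s) \ge 0$ and $|s| \ge c > 0$. Also for $r > 0$

$$\Re\Bigl(\frac1{r+it}\Bigr) = \frac{r}{r^2+t^2} > 0;$$

and, for $|t| \le 1$ and $0 < r \le 1$

$$\frac1{|r+it|} \ge \frac1{\sqrt2}.$$

From these two estimates, we then deduce (2.13).

On the other hand, if $\Re(s) \ge 0$, then

$$|f^*(s)| \le \sum_{j\ge0}q^{j(j+1)/2}|s|^{j+1} \le \Theta(|s|),$$

where

$$\Theta(x) := \sum_{-\infty<j<\infty}q^{j(j-1)/2}x^j.$$

It is easily checked that $\Theta$ satisfies the same functional equation (2.9) as $F(x)$, namely,

$$\Theta(x) = x\,\Theta(qx).$$

Thus, by the same arguments used for $F(x)$, we have, for $x > 1$,

$$\Theta(x) = x^{1/2}\exp\Bigl(\frac{(\log x)^2}{2\log\kappa}\Bigr)\vartheta(\log_\kappa x),$$

where $\vartheta(u)$ is a continuous, bounded, periodic function. Comparing this expression with (2.7)
for $F(x)$, we conclude that $\Theta(x) = O(F(x))$ for $x \ge 1$.

Let $c_m := \sqrt{q^{-2m}-1}$, $m \ge 1$. Then, for $0 < r \le 1$,

$$\max_{c_mr\le|t|\le1}\Bigl|f^*\Bigl(\frac1{r+it}\Bigr)\Bigr| \le \max_{c_mr\le|t|\le1}\Theta\Bigl(\frac1{\sqrt{r^2+t^2}}\Bigr) = O\bigl(r^mq^{-m(m-1)/2}\Theta(1/r)\bigr) = O\bigl(r^mq^{-m(m-1)/2}F(1/r)\bigr),$$

since $|r+it| \ge q^{-m}r$ in this range and $\Theta(q^mx) = q^{-m(m-1)/2}x^{-m}\Theta(x)$.

This proves (2.14) and the lemma. □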



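By splitting the integral in (2.11) into three ranges $|t| \le c_mr$, $c_mr < |t| \le 1$, and $|t| > 1$,
and then applying the estimates (2.12) and (2.14), we deduce that

$$\tilde f(x)e^{-xr} = I_r(x) + O\bigl(r^{m-1}q^{-m(m-1)/2}F(1/r) + 1\bigr), \tag{2.15}$$

where

$$I_r(x) := \frac1{2\pi}\int_{-c_mr}^{c_mr}\frac{e^{ixt}}{r+it}F\Bigl(\frac1{r+it}\Bigr)dt.$$

It remains to evaluate more precisely the integral $I_r(x)$ by the saddle-point method.
We now take

$$N = \lfloor\log_\kappa(1/r)\rfloor = \log_\kappa(1/r) + \eta,$$

where $\eta = -\{\log_\kappa(1/r)\}$. Applying the functional equation (2.10) with $s = 1/(r+it)$, we get

$$I_r(x) = \frac1{2\pi}\int_{-c_mr}^{c_mr}\frac{e^{ixt}q^{N(N-1)/2}}{(r+it)^{N+1}}F\Bigl(\frac{q^N}{r+it}\Bigr)dt.$$

By the relation

$$F(1/r) = q^{N(N-1)/2}r^{-N}F(q^{\eta}),$$

we then have

$$I_r(x) = \frac{F(1/r)}{2\pi r}\int_{-c_mr}^{c_mr}e^{ixt}\Bigl(\frac{r}{r+it}\Bigr)^{N+1}\frac{F(q^{\eta}r/(r+it))}{F(q^{\eta})}\,dt = \frac{F(1/r)}{2\pi}\int_{-c_m}^{c_m}e^{irxt}\Bigl(\frac1{1+it}\Bigr)^{N+1}\frac{F(q^{\eta}/(1+it))}{F(q^{\eta})}\,dt = \frac{F(1/r)}{2\pi}\int_{-c_m}^{c_m}e^{-xrt^2/2}H(t)\,dt,$$

where (with $r$ chosen as in (2.16) below, so that $N = xr + \eta$)

$$H(t) := e^{xr(it-\log(1+it)+t^2/2)}\,\frac{F(q^{\eta}/(1+it))}{(1+it)^{1+\eta}F(q^{\eta})}.$$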

We now choose $r = r(x) > 0$ to be the approximate saddle-point such that

$$\frac1r\log\frac1r = x\log\kappa. \tag{2.16}$$

Note that $r$ can be expressed in terms of the Lambert $W$-function (principal solution of the
equation $W(x)e^{W(x)} = x$) as

$$r = \frac{W(x\log\kappa)}{x\log\kappa};$$

thus $\log(1/r) = W(x\log\kappa)$. Asymptotically,

$$W(x) = \log x - \log\log x + \frac{\log\log x}{\log x} + \frac{(\log\log x)^2 - 2\log\log x}{2(\log x)^2} + O\Bigl(\frac{(\log\log x)^3}{(\log x)^3}\Bigr), \tag{2.17}$$

as $x \to \infty$; see Corless et al. (1996).
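
The saddle-point equation is easy to solve numerically. The following sketch computes the principal branch of $W$ by Newton's iteration (our choice of method, not the paper's; the starting point $\log(1+y)$ lies to the right of the root, so the iteration decreases monotonically), then forms $r = W(x\log\kappa)/(x\log\kappa)$ and evaluates the four-term asymptotic approximation of $W$. All function names are hypothetical.

```python
import math


def lambert_w(y, tol=1e-14):
    """Principal branch W(y) for y > 0, by Newton's method on w * exp(w) = y."""
    w = math.log(1.0 + y)  # satisfies w * exp(w) >= y, i.e. w >= W(y)
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - y) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def saddle_point_r(x, kappa):
    """Solution r of (1/r) log(1/r) = x log(kappa)."""
    y = x * math.log(kappa)
    return lambert_w(y) / y


def w_asymptotic(y):
    """First four terms of the large-y expansion of W(y)."""
    l1 = math.log(y)
    l2 = math.log(l1)
    return l1 - l2 + l2 / l1 + (l2 * l2 - 2.0 * l2) / (2.0 * l1 * l1)
```

The slow decay of the error term, in powers of $\log\log x/\log x$, is visible even for arguments as large as $10^8$.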

Since $m \ge 1$ is arbitrary and $r \asymp x^{-1}\log x$, the relation (2.15) is an asymptotic
approximation, albeit less explicit.

To derive a more explicit expansion, we first observe that

$$e^{xr}F(1/r) = r^{-1/\log\kappa-1/2}e^{(\log(1/r))^2/(2\log\kappa)}G(\log_\kappa(1/r)),$$

by (2.7) and (2.16). Then what remains is standard (see Flajolet and Sedgewick (2009)):
evaluating the integral in (2.15) by Laplace's method (a change of variable $t \mapsto t/\sqrt{xr}$ followed by
an asymptotic expansion of $H(t/\sqrt{xr})$ for large $xr$ and then an integration term by term), and
we obtain the following expansion.

Proposition 2.4. With $r$ given by (2.16), $\tilde f(x)$ satisfies

$$\tilde f(x) \sim \frac{e^{(\log(1/r))^2/(2\log\kappa)}\,G(\log_\kappa(1/r))}{r^{1/\log\kappa+1/2}\sqrt{2\pi\log_\kappa(1/r)}}\Bigl(1+\sum_{j\ge1}\frac{\phi_j(\log_\kappa(1/r))}{(\log_\kappa(1/r))^j}\Bigr), \tag{2.18}$$

as $x \to \infty$, where $G$ is given in (2.8) and the $\phi_j(u)$'s are bounded, 1-periodic functions of $u$
involving the derivatives of $F(q^{-\{u\}})$.

In particular, 

1 {«}(!-{«}) (l-W)fWf'(fW) f 2{u} r ( g -W) 



0i (u) 



12 2 F(g-W) 2F(g-M) 



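By using (2.17), the leading term in (2.18) can be expressed completely in terms of $\log x$ as
follows.

Corollary 2.5. As $x \to \infty$, $\tilde f(x)$ satisfies

$$\tilde f(x) = \frac{G\bigl(\log_\kappa\frac{x}{\log_\kappa x}\bigr)}{\sqrt{2\pi}}\,\frac{x^{1/\log\kappa+1/2}}{\log_\kappa x}\exp\Bigl(\frac{\bigl(\log\frac{x}{\log_\kappa x}\bigr)^2}{2\log\kappa}\Bigr)\Bigl(1+O\Bigl(\frac{(\log\log x)^2}{\log x}\Bigr)\Bigr). \tag{2.19}$$

This is nothing but (2.4) with $n$ there replaced by $x$.

As another consequence, we see, by (2.2) and (2.19), that

$$\frac{\tilde f'(x)}{\tilde f(x)} \sim \frac{\tilde f(qx)}{\tilde f(x)} \sim \frac{\log_\kappa x}{x}.$$

More generally, we have the following asymptotic relations for $\tilde f^{(j)}(x)$ and $\tilde f(q^jx)$.

Corollary 2.6. For $j \ge 1$

$$\frac{\tilde f^{(j)}(x)}{\tilde f(x)} \sim \Bigl(\frac{\log_\kappa x}{x}\Bigr)^j, \tag{2.20}$$

$$\frac{\tilde f(q^jx)}{\tilde f(x)} \sim q^{-j(j-1)/2}\Bigl(\frac{\log_\kappa x}{x}\Bigr)^j. \tag{2.21}$$

Note that (2.20) also follows easily from the integral representation

$$\tilde f^{(j)}(x) = \frac1{2\pi i}\int_{r-i\infty}^{r+i\infty}e^{xs}s^{j-1}f^*\Bigl(\frac1s\Bigr)ds$$

and exactly the same arguments used above.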
2.3 Asymptotics of $\mu_n$

We first derive a simple lemma for the ratio $\tilde f(x+y)/\tilde f(x)$ when $y$ is not too large by using
(2.20).

Lemma 2.7. Assume $x > 1$. If $|y| = o(x/\log x)$, then

$$\frac{\tilde f(x+y)}{\tilde f(x)} = 1 + O\Bigl(\frac{|y|\log x}{x}\Bigr). \tag{2.22}$$

Proof. By (2.20), we have

$$\log\frac{\tilde f(x+y)}{\tilde f(x)} = y\int_0^1\frac{\tilde f'(x+yt)}{\tilde f(x+yt)}\,dt = O\Bigl(|y|\int_0^1\frac{\log|x+yt|}{|x+yt|}\,dt\Bigr) = O\Bigl(\frac{|y|\log x}{x}\Bigr),$$

from which (2.22) follows. □

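Theorem 2.8. The expected cost used by the exhaustive search algorithm satisfies the
asymptotic expansion

$$\mu_n \sim \sum_{j\ge0}\frac{\tilde f^{(j)}(n)}{j!}\,\tau_j(n), \tag{2.23}$$

where $\tau_j(n)$ is a (Charlier) polynomial in $n$ of degree $\lfloor j/2\rfloor$ defined by

$$\tau_j(n) := \sum_{0\le\ell\le j}\binom j\ell(-n)^{j-\ell}\frac{n!}{(n-\ell)!} \qquad (j = 0, 1, \dots). \tag{2.24}$$

In particular, $\tau_0(n) = 1$, $\tau_1(n) = 0$, $\tau_2(n) = -n$, $\tau_3(n) = 2n$, and $\tau_4(n) = 3n^2 - 6n$. Thus,
by (2.18) and (2.20),

$$\mu_n = \tilde f(n)\bigl(1 + O(n^{-1}(\log n)^2)\bigr),$$

which proves Theorem 2.1.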
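
The first few Charlier-type polynomials listed in the theorem can be recomputed directly. The sketch below assumes the representation $\tau_j(n) = n![z^n](z-n)^je^z = \sum_{0\le\ell\le j}\binom j\ell(-n)^{j-\ell}\,n!/(n-\ell)!$ for nonnegative integers $n$ (terms with $\ell > n$ vanish); the function name `tau` is ours.

```python
from math import comb, factorial


def tau(j, n):
    """tau_j(n) = n! [z^n] (z - n)^j e^z, evaluated at an integer n >= 0."""
    total = 0
    for l in range(min(j, n) + 1):
        falling = factorial(n) // factorial(n - l)  # n (n-1) ... (n-l+1)
        total += comb(j, l) * (-n) ** (j - l) * falling
    return total
```

Evaluating at $n = 0, 1, \dots, 12$ reproduces $1$, $0$, $-n$, $2n$ and $3n^2 - 6n$ for $j = 0, \dots, 4$.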

Proof. For simplicity, we prove only the following estimate

$$\mu_n = \tilde f(n) - \frac n2\tilde f''(n) + O\bigl(n^{-2}(\log n)^4\tilde f(n)\bigr). \tag{2.25}$$

The same method of proof easily extends to the proof of (2.23).

We start with the Taylor expansion of $\tilde f(z)$ at $z = n$ to the fourth order

$$\tilde f(z) = \tilde f(n) + \tilde f'(n)(z-n) + \frac{\tilde f''(n)}{2!}(z-n)^2 + \frac{\tilde f'''(n)}{3!}(z-n)^3 + (z-n)^4R(z), \tag{2.26}$$

where

$$R(z) := \frac1{3!}\int_0^1\tilde f^{(4)}(n+(z-n)t)(1-t)^3\,dt.$$

By applying successively the equation (2.2), we get

$$\tilde f^{(4)}(z) = -e^{-z} + q^3e^{-qz} - q^5e^{-q^2z} + q^6e^{-q^3z} + q^6\tilde f(q^4z).$$

It follows that

$$|R(ne^{i\theta})| \le \int_0^1\bigl|\tilde f^{(4)}(n+n(e^{i\theta}-1)t)\bigr|\,dt = O\Bigl(e^{-n\cos\theta} + e^{-q^3n\cos\theta} + \int_0^1\bigl|\tilde f(q^4n+q^4n(e^{i\theta}-1)t)\bigr|\,dt\Bigr),$$

for $|\theta| \le \pi$. Replacing first $\tilde f(z)$ inside the integral by $e^{-z}f(z)$, using the inequality
$|f(z)| \le f(|z|)$ and then substituting back $f(q^4n)$ by $e^{q^4n}\tilde f(q^4n)$, we then have

$$|R(ne^{i\theta})| = O\Bigl(e^{-q^3n\cos\theta} + \tilde f(q^4n)\int_0^1\bigl|e^{-q^4n(e^{i\theta}-1)t}\bigr|\,dt\Bigr) = O\Bigl(e^{-q^3n\cos\theta} + \tilde f(q^4n)\int_0^1e^{q^4n(1-\cos\theta)t}\,dt\Bigr) = O\bigl(e^{-q^3n\cos\theta} + \tilde f(q^4n)e^{q^4n(1-\cos\theta)}\bigr), \tag{2.27}$$

uniformly for $|\theta| \le \pi$. By Cauchy's integral formula and (2.26), we have

$$\mu_n = \frac{n!}{2\pi i}\oint_{|z|=n}z^{-n-1}e^z\tilde f(z)\,dz = \frac{n!}{2\pi i}\oint_{|z|=n}z^{-n-1}e^z\Bigl(\tilde f(n) + \tilde f'(n)(z-n) + \frac{\tilde f''(n)}{2!}(z-n)^2 + \frac{\tilde f'''(n)}{3!}(z-n)^3\Bigr)dz + R_n,$$

where

$$R_n := \frac{n!}{2\pi i}\oint_{|z|=n}z^{-n-1}e^z(z-n)^4R(z)\,dz.$$

By the estimate (2.27) for $R(z)$, we have

$$R_n = O\Bigl(n!\,n^{4-n}\int_{-\pi}^{\pi}\theta^4e^{n\cos\theta}|R(ne^{i\theta})|\,d\theta\Bigr)$$
$$= O\Bigl(n!\,n^{4-n}\int_{-\pi}^{\pi}\theta^4e^{n\cos\theta}\bigl(e^{-q^3n\cos\theta} + \tilde f(q^4n)e^{q^4n(1-\cos\theta)}\bigr)d\theta\Bigr)$$
$$= O\Bigl(n!\,n^{4-n}\int_{-\pi}^{\pi}\theta^4e^{n(1-q^3)\cos\theta}\,d\theta + n!\,\tilde f(q^4n)\,n^{4-n}e^n\int_{-\pi}^{\pi}\theta^4e^{-(1-q^4)n(1-\cos\theta)}\,d\theta\Bigr)$$
$$= O\bigl(n!\,n^{-n+3/2}e^{(1-q^3)n} + n!\,e^nn^{-n+3/2}\tilde f(q^4n)\bigr)$$
$$= O\bigl(n^2e^{-q^3n} + n^2\tilde f(q^4n)\bigr)$$
$$= O\bigl(n^{-2}(\log n)^4\tilde f(n)\bigr),$$

by (2.21). Note that again by (2.20)

$$n\tilde f'''(n) = O\bigl(n^{-2}(\log n)^3\tilde f(n)\bigr),$$

so this error bound is absorbed in $O(\tilde f(n)n^{-2}(\log n)^4)$. This proves (2.25). □

3 Alternative expansions and approaches 

We discuss in this section other possible approaches to the asymptotic expansions we derived 
above. 

3.1 An alternative expansion for f(x) 

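We begin with an alternative asymptotic expansion for $\tilde f(x)$, starting from the integral
representation (2.11), which, as shown above, can be approximated by

$$\tilde f(x) = \frac1{2\pi i}\int_{r-i\infty}^{r+i\infty}\frac{e^{xs}}{s}F\Bigl(\frac1s\Bigr)ds + O(1).$$

For simplicity, we will write this as

$$\tilde f(x) \sim \frac1{2\pi i}\int_{r-i\infty}^{r+i\infty}\frac{e^{xs}}{s}F\Bigl(\frac1s\Bigr)ds.$$

Now we use the same $N = \lfloor\log_\kappa(1/r)\rfloor = \log_\kappa(1/r) - \eta$ (with $\eta := \{\log_\kappa(1/r)\}$ here) and

$$F\Bigl(\frac1s\Bigr) = q^{N(N-1)/2}s^{-N}F\Bigl(\frac{q^N}{s}\Bigr),$$

so that

$$\tilde f(x) \sim \frac{q^{\binom N2}}{2\pi i}\int_{r-i\infty}^{r+i\infty}\frac{e^{xs}}{s^{N+1}}F\Bigl(\frac{q^N}{s}\Bigr)ds. \tag{3.1}$$

Now instead of expanding $F(q^N/(r+it))$ at $t = 0$, we expand $F(q^N/s)$ at $s = r$, giving

$$F\Bigl(\frac{q^N}{s}\Bigr) = \sum_{m\ge0}\frac{F_m}{m!}\Bigl(\frac{q^N}{s}-\frac{q^N}{r}\Bigr)^m = \sum_{m\ge0}\frac{(-1)^mQ^mF_m}{m!}\Bigl(1-\frac rs\Bigr)^m,$$

where $Q := q^N/r = q^{-\{\log_\kappa(1/r)\}}$ and $F_j$ denotes $F^{(j)}(Q)$. Substituting this expansion into the
integral representation (3.1) and then integrating term by term, we obtain

$$\tilde f(x) \sim \frac{q^{\binom N2}x^N}{N!}\sum_{m\ge0}\frac{(-1)^mQ^mF_m}{m!}\,T_m(N), \tag{3.2}$$

where, by the integral representation for the Gamma function (see Flajolet and Sedgewick (2009)),

$$T_m(N) := \frac{N!}{x^N}\,\frac1{2\pi i}\int_{r-i\infty}^{r+i\infty}\frac{e^{xs}}{s^{N+1}}\Bigl(1-\frac rs\Bigr)^mds = \sum_{0\le j\le m}\binom mj(-1)^j\frac{N!\,(rx)^j}{(N+j)!}.$$

For computational purposes, it is preferable to use the recurrence

$$T_m(N) = T_{m-1}(N) - \frac{rx}{N+1}\,T_{m-1}(N+1).$$

The value of $r$ is arbitrary up to now. If we take $r = N/x$, then

$$T_m(N) = \sum_{0\le j\le m}\binom mj(-1)^j\frac{N!\,N^j}{(N+j)!}.$$

Note that $|T_m(N)| \asymp N^{-\lceil m/2\rceil}$. In particular,

$$T_0(N) = 1, \qquad T_1(N) = \frac1{N+1}, \qquad T_2(N) = -\frac{N-2}{(N+1)(N+2)}.$$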
Since $q^N/r$ remains bounded, we can regroup the terms and get an asymptotic expansion in terms of increasing powers of $N^{-1}$, the first few terms being given as follows:

$$ \frac{f(x)}{q^{\binom{N}{2}}x^N/N!} \sim F_0 - \frac{Q(2F_1 + F_2Q)}{2N} + \frac{Q(3F_4Q^3 + 28F_3Q^2 + 60F_2Q + 24F_1)}{24N^2} - \frac{Q(F_6Q^5 + 22F_5Q^4 + 152F_4Q^3 + 384F_3Q^2 + 312F_2Q + 48F_1)}{48N^3} + \cdots. $$

On the other hand, if we choose $r = (N+1)/x$, then $T_1(N) = 0$ and

$$ T_0(N) = 1,\qquad T_2(N) = -\frac{1}{N+2},\qquad T_3(N) = -\frac{4}{(N+2)(N+3)}, $$

so that

$$ \frac{f(x)}{q^{\binom{N}{2}}x^N/N!} \sim F_0 - \frac{Q^2F_2}{2(N+2)} + \frac{Q^3(3F_4Q + 16F_3)}{24(N+2)^2} - \frac{Q^3(F_6Q^3 + 16F_5Q^2 + 60F_4Q + 32F_3)}{48(N+2)^3} + \cdots. $$

While $|T_m(N)| \asymp N^{-\lceil m/2\rceil}$ for $m \ge 2$ as in the case of $r = N/x$, this is a better expansion because the first term incorporates more information.

The more transparent expansion (3.2) is a priori a formal one, whose asymptotic nature can be easily justified by the same local analysis as above, details being omitted here. We summarize the analysis in the following theorem.

Theorem 3.1. The Poisson generating function of $\mu_n$ satisfies the asymptotic expansion

$$ f(x) \sim q^{\binom{N}{2}}\frac{x^N}{N!}\sum_{m\ge 0}\frac{(-1)^m Q^m}{m!}\,F^{(m)}(Q)\,T_m(N), $$

where $N = \lfloor\log_\kappa(1/r)\rfloor = \log_\kappa(1/r) - \eta$, $r := N/x$, $Q := q^{-\{\log_\kappa(1/r)\}}$ and $T_m(N)$ is defined by

$$ T_m(N) := \sum_{0\le j\le m}\binom{m}{j}(-1)^j\,\frac{N!\,N^j}{(N+j)!}. $$

Straightforward calculations give (when r = N/x) 

2 



log ? u; T77 = ^"j — + \ + o lo g^ - log log a; 

NIJ ilogK \logK 2/ 

1 V 2 + V t n f (log log x) 

log 2,71 h U ' 



2 2 V log^ 

consistent with what we proved in (2.19) via directly applying the saddle-point method. For 
similar types of approximation, see Heller (1971); Mahler (1940). 

3.2 Exponential GFs vs ordinary GFs 

The different forms of the GFs of the sequence $\mu_n$ have several interesting features which we now briefly explore.

Instead of $f^*(s)$, we start with considering the usual Laplace transform of $f(z)$

$$ \mathscr{L}_f(s) = \int_0^\infty e^{-xs} f(x)\,dx, $$

which, by (2.6), satisfies

$$ \mathscr{L}_f(s) = \sum_{j\ge 0}\frac{q^{\binom{j+1}{2}}}{s^{j+1}(s+q^j)}. $$

By inverting this series, we obtain

$$ f(z) = \sum_{j\ge 0}\frac{q^{\binom{j+1}{2}}}{j!}\,z^{j+1}\int_0^1 e^{-q^j uz}(1-u)^j\,du. $$

From this exact expression, we deduce not only the exact expression (2.3) but also the following one (by multiplying both sides by $e^z$ and then expanding)

$$ \mu_n = n\sum_{0\le j<n}\binom{n-1}{j}q^{\binom{j+1}{2}}\sum_{0\le \ell<n-j}\binom{n-1-j}{\ell}\frac{q^{j\ell}(1-q^j)^{n-1-j-\ell}}{\ell+j+1}, \tag{3.4} $$

where all terms are now positive; compare (2.3). But this expression and (2.3) are less useful for numerical purposes for large $n$.
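The positive-term formula can be compared against the defining recurrence for small $n$. The sketch below is a minimal check, assuming the recurrence $\mu_n = \mu_{n-1} + \sum_{0\le j<n}\binom{n-1}{j}q^jp^{n-1-j}\mu_j$ for $n \ge 2$ with $\mu_0 = 0$, $\mu_1 = 1$ (the form implied by the moment-GF recurrence of Section 4.1), and reading (3.4) as $\mu_n = n\sum_{j}\binom{n-1}{j}q^{j(j+1)/2}\sum_{\ell}\binom{n-1-j}{\ell}q^{j\ell}(1-q^j)^{n-1-j-\ell}/(\ell+j+1)$. The function names and the sample value $p = 1/3$ are illustrative choices.

```python
from fractions import Fraction
from math import comb


def mu_by_recurrence(n_max, p):
    """mu_0..mu_{n_max} from the binomial recurrence (assumed form, see lead-in)."""
    q = 1 - p
    mu = [Fraction(0), Fraction(1)]
    for n in range(2, n_max + 1):
        s = sum(comb(n - 1, j) * q**j * p**(n - 1 - j) * mu[j] for j in range(n))
        mu.append(mu[n - 1] + s)
    return mu


def mu_by_formula(n, p):
    """Double sum with positive terms, as read from (3.4)."""
    q = 1 - p
    total = Fraction(0)
    for j in range(n):
        inner = sum(
            comb(n - 1 - j, l) * q**(j * l) * (1 - q**j)**(n - 1 - j - l)
            / Fraction(l + j + 1)
            for l in range(n - j)
        )
        total += comb(n - 1, j) * q**(j * (j + 1) // 2) * inner
    return n * total


p = Fraction(1, 3)
mu = mu_by_recurrence(10, p)
agree = all(mu[n] == mu_by_formula(n, p) for n in range(1, 11))
print(agree, [float(v) for v in mu[:6]])
```

Both routines cost polynomial time in $n$, but the exact rationals grow quickly, which illustrates why such closed forms are of limited numerical use for large $n$.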
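On the other hand, the consideration of our $f^*(s)$ essentially bridges the EGF and the OGF of $\mu_n$. Indeed,

$$ f^*(s) = \frac{1}{s}\int_0^\infty e^{-x-x/s}\sum_{n\ge 0}\frac{\mu_n}{n!}\,x^n\,dx = \frac{1}{1+s}\sum_{n\ge 0}\mu_n\left(\frac{s}{1+s}\right)^n, $$

which is essentially the Euler transform of the OGF; see Flajolet and Richmond (1992).

Our proofs given above rely strongly on the use of EGF, but the use of OGF works equally well for some of them. We consider the general recurrence (4.6). Then the OGF $A(z) := \sum_{n\ge 1}a_nz^n$ satisfies

$$ A(z) = zA(z) + \frac{z}{1-pz}\,A\left(\frac{qz}{1-pz}\right) + B(z), $$

where $B(z) := \sum_{n\ge 1}b_nz^n$. Thus $\bar A(z) := (1-z)A(z)$ satisfies

$$ \bar A(z) = B(z) + \frac{z}{1-z}\,\bar A\left(\frac{qz}{1-pz}\right), $$

which after iteration gives

$$ \bar A(z) = \sum_{j\ge 0}q^{\binom{j}{2}}\left(\frac{z}{1-z}\right)^j B\left(\frac{q^jz}{1-(1-q^j)z}\right). $$

Thus

$$ A(z) = \frac{1}{1-z}\sum_{j\ge 0}q^{\binom{j}{2}}\left(\frac{z}{1-z}\right)^j B\left(\frac{q^jz}{1-(1-q^j)z}\right). \tag{3.5} $$

Closed-form expressions can be derived from this; we omit the details here.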
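The iterated OGF identity can be tested coefficient by coefficient. The sketch below assumes $a_0 = 0$ and the iterated form $A(z) = \frac{1}{1-z}\sum_{j\ge0}q^{j(j-1)/2}\left(\frac{z}{1-z}\right)^jB\left(\frac{q^jz}{1-(1-q^j)z}\right)$, and extracts $[z^n]$ with the elementary expansions of $(1-z)^{-(j+1)}$ and $(1-cz)^{-k}$. The input sequence $b_n = n^2+1$ and the value $p = 2/5$ are arbitrary test data, and the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def a_by_recurrence(b, p):
    """a_n = a_{n-1} + sum_j C(n-1,j) q^j p^(n-1-j) a_j + b_n, with a_0 = 0; b[0] unused."""
    q = 1 - p
    a = [Fraction(0)]
    for n in range(1, len(b)):
        s = sum(comb(n - 1, j) * q**j * p**(n - 1 - j) * a[j] for j in range(n))
        a.append(a[n - 1] + s + b[n])
    return a


def a_by_ogf(n, b, p):
    """[z^n] of the iterated OGF expression (assumed form, see lead-in)."""
    q = 1 - p
    total = Fraction(0)
    for j in range(n):
        c = 1 - q**j                      # the j-th iterate is q^j z / (1 - c z)
        for k in range(1, n - j + 1):     # term b_k * (iterate)^k
            m = n - j - k                 # remaining degree after the factor z^(j+k)
            coeff = sum(
                comb(i + j, j) * comb(m - i + k - 1, k - 1) * c**(m - i)
                for i in range(m + 1)
            )
            total += q**(j * (j - 1) // 2) * q**(j * k) * b[k] * coeff
    return total


p = Fraction(2, 5)
b = [Fraction(0)] + [Fraction(n * n + 1) for n in range(1, 9)]
a = a_by_recurrence(b, p)
ogf_ok = all(a[n] == a_by_ogf(n, b, p) for n in range(1, 9))
print(ogf_ok)
```

The $j = 0$ term alone contributes $\sum_{k\le n}b_k$, which is the part that dominates in the asymptotic transfer of Section 4.3.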



4 Variance of Y n 

We derive in this section the asymptotics of the variance of $Y_n$ (see (1.9)), which can be regarded as a very rough independent approximation to $X_n$. We use an elementary approach (no complex analysis being needed) here based on the recurrences of the central moments and suitable tools of "asymptotic transfer" for the underlying recurrence. The approach is, up to the development of asymptotic tools, by now standard; see Hwang (2003); Hwang and Neininger (2002). The same analysis provided here is also applicable to higher central moments, which will be analyzed in the next section.



4.1 Recurrence 

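For the variance of $Y_n$, we start with the recurrence (1.9), which translates into the recurrence satisfied by the moment GF $M_n(y) := \mathbb{E}(e^{Y_ny})$

$$ M_n(y) = M_{n-1}(y)\sum_{0\le j<n}\pi_{n,j}M_j(y)\qquad(n\ge 2), $$

with $M_0(y) = 1$ and $M_1(y) = e^y$, where $\pi_{n,j} := \binom{n-1}{j}q^jp^{n-1-j}$. This implies, with $\bar M_n(y) := e^{-\mu_ny}M_n(y) = \mathbb{E}(e^{(Y_n-\mu_n)y})$, that

$$ \bar M_n(y) = \bar M_{n-1}(y)\sum_{0\le j<n}\pi_{n,j}\bar M_j(y)\,e^{\Delta_{n,j}y}\qquad(n\ge 2), \tag{4.1} $$

with $\bar M_n(y) = 1$ for $n < 2$, where

$$ \Delta_{n,j} := \mu_j + \mu_{n-1} - \mu_n. $$

Let $M_{n,m} := \mathbb{E}(Y_n-\mu_n)^m = \bar M_n^{(m)}(0)$, $m \ge 0$. Then from (4.1), we deduce that

$$ M_{n,m} = M_{n-1,m} + \sum_{0\le j<n}\pi_{n,j}M_{j,m} + T_{n,m}, \tag{4.2} $$

where, for $m \ge 1$,

$$ T_{n,m} := \sum_{0\le \ell<m}\binom{m}{\ell}\sum_{0\le j<n}\pi_{n,j}M_{j,\ell}\Delta_{n,j}^{m-\ell} + \sum_{1\le k<m}\binom{m}{k}M_{n-1,k}\sum_{0\le \ell\le m-k}\binom{m-k}{\ell}\sum_{0\le j<n}\pi_{n,j}M_{j,\ell}\Delta_{n,j}^{m-k-\ell}. \tag{4.3} $$

Note that since $M_{n,1} = 0$ and $\sum_{0\le j<n}\pi_{n,j}\Delta_{n,j} = 0$, terms with $k = 1$ and $k = m-1$ vanish. In particular, the variance $\sigma_n^2 = M_{n,2}$ satisfies

$$ \sigma_n^2 = \sigma_{n-1}^2 + \sum_{0\le j<n}\pi_{n,j}\sigma_j^2 + T_{n,2}, $$

where

$$ T_{n,2} = \sum_{0\le j<n}\pi_{n,j}\Delta_{n,j}^2. $$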



4.2 Asymptotics of $T_{n,2}$

To proceed further, we first consider the asymptotics of $\Delta_{n,j}$ for $j = qn + O(n^{2/3})$. By Taylor expansion and (2.2), we have

/(n) - /(n - 1) = f{n) - + ^ + O ( Al - t) 4 / (4) (™ - f) df 



and 



/V)-^ + ^ + o(/(A)) 



/»-/>-l) = /'» + 0(/ ( g V)]. 
These and (2.25) yield

Mn - Mn-x = f'(n) - + O (n 2 ~f (g 4 n)~ 

= /(gn)+o(n 2 /(A)), 
since $f(q^2n) = O\left(n^2(\log n)^{-2}f(q^4n)\right)$. Then, for $j = qn + x\sqrt{pqn}$, $|x| \le n^{1/6}$,

$$ \Delta_{n,j} = f(qn + x\sqrt{pqn}) - f(qn) + O\left(n^2f(q^4n)\right) = f'(qn)\,x\sqrt{pqn} + O\left(n^2(1+x^2)f(q^4n)\right). \tag{4.4} $$

Thus, by (2.20) and (2.21),



T„, 2 = £ 7r n)i |/'(gn>^+0(n 2 /(g 4 n))| + O /i 2 ^ 
$$ = pqn\,f'(qn)^2\sum_{j}\pi_{n,j}|x|^2 + O\left(n^{9/2}f^2(q^4n)\right) $$

$$ = pqn\,f'(qn)^2 + O\left(n^{9/2}f^2(q^4n)\right) $$

$$ \sim pq^{-1}n^{-3}(\log_\kappa n)^4f(n)^2. \tag{4.5} $$

The next step then is to "transfer" this estimate to the asymptotics of the variance.

4.3 Asymptotic transfer 

We now develop an asymptotic transfer result, which will be used to compute the asymptotics of higher central moments of $Y_n$ (in particular the variance).

More generally, we consider a sequence $\{a_n\}_{n\ge 0}$ satisfying the recurrence relation

$$ a_n = a_{n-1} + \sum_{0\le j<n}\pi_{n,j}a_j + b_n\qquad(n\ge 1), \tag{4.6} $$

where $a_0$ is finite (whose value is immaterial) and $\{b_n\}_{n\ge 1}$ is a given sequence.

Lemma 4.1. If $b_n \sim n^\beta(\log n)^\xi f(n)^\alpha$, where $\alpha > 0$, $\beta, \xi \in \mathbb{R}$, then

$$ \sum_{j\le n}b_j \sim \frac{n}{\alpha\log_\kappa n}\,b_n. $$

Proof. Define $\varphi(t) := t^\beta(\log t)^\xi f(t)^\alpha$. By assumption, $b_n \sim \varphi(n)$. Since $f'(t)/f(t) \sim t^{-1}\log_\kappa t$ (by (2.20)), we see that $\varphi'(t) > 0$ for $t$ sufficiently large, say $t \ge t_0 > 0$. Thus $\varphi(t)$ is monotonically increasing for $t \ge t_0$. Then



/■n 
By the asymptotic relation (2.20), we have

rn 

<p(t)dt= / t"(logt)*/(t) a dt 



(log«) / ^ +1 (logt)«- 1 /W a-1 / , (*)d* 

log K 



a 



^'(logt^-'d/W 



rMn) +0 (f"m it 



alog K n 

by an integration by parts. The integral on the right-hand side is easily estimated as follows. 

jH ^ dt = O L>(qn) J*" r 1 dt + (p(n) J" r 1 dtj 
= 0(<p(n)). 

This proves the lemma. □ 
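Proposition 4.2. If $b_n \sim n^\beta(\log n)^\xi f(n)^\alpha$, where $\alpha > 1$, $\beta, \xi \in \mathbb{R}$, then

$$ a_n = \left(1 + O\left(n^{1-\alpha}(\log n)^{\alpha-1}\right)\right)\sum_{j\le n}b_j \sim \frac{n}{\alpha\log_\kappa n}\,b_n. \tag{4.7} $$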

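Proof. We start with obtaining upper and lower bounds for $a_n$. Since $b_n > 0$ for sufficiently large $n$, say $n \ge n_0$, we may, without loss of generality, assume that $b_n \ge 0$ for all $n \ge 1$ (for, otherwise, we consider $b_n' := b_n + \max_{j\le n_0}|b_j|$ and then show that the difference between the corresponding $a_n'$ and $a_n$ is of order $f(n)$). Then $a_n \ge 0$ and, by (4.6), we have the lower bound

$$ a_n \ge a_{n-1} + b_n \ge \sum_{j\le n}b_j. $$

Now consider the sequence

$$ C_n := \frac{a_n}{\sum_{j\le n}b_j}\qquad(n\ge 1), $$

and the increasing sequence

$$ C_n^* := \max_{1\le k\le n}C_k. $$

Then we have the upper bound

$$ a_k \le C_n^*\sum_{j\le k}b_j $$

for all $k \le n$.

In view of the recurrence relation (4.6), we have

$$ a_n \le C_{n-1}^*\sum_{j<n}b_j + C_{n-1}^*\sum_{0\le j<n}\pi_{n,j}\sum_{i\le j}b_i + b_n \le C_{n-1}^*\sum_{j\le n}b_j + C_{n-1}^*\sum_{0\le j<n}\pi_{n,j}\sum_{i\le j}b_i, $$

where the second inequality uses $C_{n-1}^* \ge 1$, a consequence of the lower bound above. By Lemma 4.1 and Corollary 2.6, we see that there exists an absolute constant $K > 0$ such that

$$ \sum_{0\le j<n}\pi_{n,j}\sum_{i\le j}b_i \le Kn^{-\alpha}(\log n)^\alpha\sum_{i\le n}b_i = O\left(n^{1-\alpha}(\log n)^{\alpha-1}b_n\right). \tag{4.8} $$

It follows that

$$ a_n \le C_{n-1}^*\left(1 + Kn^{-\alpha}(\log n)^\alpha\right)\sum_{j\le n}b_j. $$

By our definition of $C_n$, we then have

$$ C_n \le C_{n-1}^*\left(1 + Kn^{-\alpha}(\log n)^\alpha\right), $$

and

$$ C_n^* = \max\{C_{n-1}^*, C_n\} \le C_{n-1}^*\left(1 + Kn^{-\alpha}(\log n)^\alpha\right). $$

Consequently,

$$ C_n^* \le C_2^*\prod_{2<j\le n}\left(1 + Kj^{-\alpha}(\log j)^\alpha\right). $$

Since the product on the right-hand side converges as $n \to \infty$ (because $\alpha > 1$), we conclude that the sequence $C_n^*$ is bounded, or more precisely,

$$ C_n^* \le C_2^*\prod_{j>2}\left(1 + Kj^{-\alpha}(\log j)^\alpha\right). $$

Thus we obtain the upper bound

$$ a_n \le C\sum_{j\le n}b_j, $$

where $C > 0$ is an absolute constant depending only on $p$, $\alpha$, $\beta$ and $\xi$.

With this bound and defining $\bar a_n := \sum_{0\le j<n}\pi_{n,j}a_j$, we can rewrite the recurrence relation (4.6) as

$$ a_n = \sum_{j\le n}b_j + \sum_{j\le n}\bar a_j. \tag{4.9} $$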

Now by the estimate (4.8), we see that 

E a i = ( 1 + E ./" °o°gj)° l m 

0<i<n V 2<j<n / 

where := (log tff(t) a . Observe that 

p(gn) ~n- Q (logn) Q 6 n ~ n-^^logn)^ 1 fy. 

Thus 



E 5 i = ° [ ^-"(logn)*- 1 b i) 



The proof of the Proposition is complete by substituting this estimate into (4.9). □ 
Denote by $[z^n]A(z)$ the coefficient of $z^n$ in the Taylor expansion of $A(z)$. Then, in terms of ordinary GFs, the asymptotic transfer (4.7) can be stated alternatively as

$$ [z^n]A(z) \sim [z^n]\frac{B(z)}{1-z} $$

(when $b_n$ satisfies the assumption of Proposition 4.2), which means that the contribution from terms in the sum in (3.5) with $j \ge 1$ is asymptotically negligible. Roughly, since

$$ B\left(\frac{q^jz}{1-(1-q^j)z}\right) = \sum_{n\ge 1}b_{n,j}z^n,\qquad b_{n,j} := \sum_{1\le k\le n}\binom{n-1}{k-1}q^{jk}(1-q^j)^{n-k}\,b_k, $$

we see that $b_{n,j} = O\left(q^jb_{\lfloor q^jn\rfloor}\right)$. We can then give an alternative proof of (4.7) by using (3.5).

By (4.5) and a direct application of Proposition 4.2, we obtain an asymptotic approximation to the variance.

Theorem 4.3. The variance of $Y_n$ satisfies

$$ \sigma_n^2 \sim C_\sigma n^{-2}(\log_\kappa n)^3f(n)^2, \tag{4.10} $$

where $C_\sigma := p/(2q)$.

Thus we have

$$ \frac{\mathbb{V}(Y_n)}{(\mathbb{E}(Y_n))^2} \sim C_\sigma n^{-2}(\log_\kappa n)^3. $$



Monte Carlo simulations (with $n$ a few hundred) suggest that the ratio $\mathbb{V}(X_n)/\mathbb{V}(Y_n)$ grows concavely, so that one would expect an order of the form $n^\beta(\log n)^\xi$ for some $0 < \beta < 1$. But due to the complexity of the problem, we could not run simulations of larger samples to draw more convincing conclusions. The asymptotics of $\mathbb{V}(X_n)$ remains open.



5 Asymptotic normality 

We prove in this section that Y n is asymptotically normally distributed by the method of mo- 
ments. Our approach is to start from the recurrence (4.2) for the central moments and the 
asymptotic estimate (4.10) and then to apply inductively the asymptotic transfer result (Propo- 
sition 4.2), similar to that used in our previous papers Hwang (2003); Hwang and Neininger 
(2002). 

Theorem 5.1. The distribution of $Y_n$ is asymptotically normal, namely,

$$ \frac{Y_n - \mu_n}{\sigma_n} \xrightarrow{d} \mathcal{N}(0,1), $$

where $\xrightarrow{d}$ denotes convergence in distribution.

We will indeed prove convergence of all moments. 
Proof. By the standard moment convergence theorem, it suffices to show that

$$ M_{n,m} = \mathbb{E}(Y_n - \mu_n)^m \begin{cases} \sim \dfrac{m!}{(m/2)!\,2^{m/2}}\,\sigma_n^m, & \text{if } m \text{ is even},\\[1ex] = o(\sigma_n^m), & \text{if } m \text{ is odd}, \end{cases} \tag{5.1} $$

for $m \ge 0$.

The cases when $m \le 2$ having been proved above, we assume $m \ge 3$. By induction hypothesis, we have

$$ M_{n,k} = O(\sigma_n^k) = O\left(n^{-k}(\log n)^{3k/2}f(n)^k\right) $$

for $k < m$. Then, by (4.4),

$$ \sum_{0\le j<n}\pi_{n,j}M_{j,\ell}\Delta_{n,j}^h = O\left(M_{\lfloor qn\rfloor,\ell}\,n^{h/2}f(q^2n)^h\right) = O\left(n^{-\ell}(\log n)^{3\ell/2}f(qn)^\ell\,n^{h/2}f(q^2n)^h\right) = O\left(n^{-2\ell-3h/2}(\log n)^{5\ell/2+2h}f(n)^{\ell+h}\right). $$

It follows (see (4.3)) that, for $0 \le \ell < m$,

$$ \sum_{0\le j<n}\pi_{n,j}M_{j,\ell}\Delta_{n,j}^{m-\ell} = O\left(n^{-\ell/2-3m/2}(\log n)^{\ell/2+2m}f(n)^m\right); $$

and, for $2 \le k \le m-2$ and $0 \le \ell \le m-k$,

$$ M_{n-1,k}\sum_{0\le j<n}\pi_{n,j}M_{j,\ell}\Delta_{n,j}^{m-k-\ell} = O\left(n^{-\ell/2+k/2-3m/2}(\log n)^{\ell/2-k/2+2m}f(n)^m\right). $$

Thus the main contribution to the asymptotics of $T_{n,m}$ will come from the terms in the second group of sums in (4.3) with $k = m-2$ and $\ell = 0$. More precisely,

$$ T_{n,m} = \binom{m}{2}M_{n-1,m-2}\,T_{n,2} + O\left(n^{-3/2-m}(\log n)^{3(m+1)/2}f(n)^m\right). $$

Note that $T_{n,2} \sim 2n^{-1}(\log_\kappa n)\,\sigma_n^2$; see (4.5).

Thus if $m$ is even, then, by (4.5) and induction hypothesis,

$$ T_{n,m} \sim \frac{2\,m!}{((m-2)/2)!\,2^{m/2}}\,n^{-1}(\log_\kappa n)\,\sigma_n^m \sim \frac{2\,m!}{((m-2)/2)!\,2^{m/2}}\,C_\sigma^{m/2}\,n^{-m-1}(\log_\kappa n)^{3m/2+1}f(n)^m. $$

Applying the asymptotic transfer result (Proposition 4.2) with $\alpha = m$, we obtain

$$ M_{n,m} \sim \frac{m!}{(m/2)!\,2^{m/2}}\,C_\sigma^{m/2}\,n^{-m}(\log_\kappa n)^{3m/2}f(n)^m \sim \frac{m!}{(m/2)!\,2^{m/2}}\,\sigma_n^m. $$

In a similar manner, we can prove that if $m$ is odd, then

$$ M_{n,m} = o(\sigma_n^m). $$

This concludes the proof of (5.1) and the asymptotic normality of $Y_n$. □
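The bookkeeping of constants in the even case can be checked mechanically. Writing $g(m) := m!/((m/2)!\,2^{m/2})$ for the even moments of the standard normal law, one induction step multiplies $g(m-2)$ by $\binom{m}{2}$ (from the dominant term of $T_{n,m}$), by $2$ (from $T_{n,2} \sim 2n^{-1}(\log_\kappa n)\sigma_n^2$) and by $1/m$ (from the transfer factor $n/(m\log_\kappa n)$), and the result should equal $g(m)$. The short sketch below confirms this identity; the function names are illustrative and the three factors are taken as read from the proof above.

```python
from fractions import Fraction
from math import comb, factorial


def gaussian_moment(m):
    """m-th moment of N(0,1): m!/((m/2)! 2^(m/2)) for even m, 0 for odd m."""
    if m % 2:
        return Fraction(0)
    return Fraction(factorial(m), factorial(m // 2) * 2 ** (m // 2))


def induction_step(m):
    """Constant after one induction step for even m >= 2:
    C(m,2) * g(m-2) * 2 / m (factors as described in the lead-in)."""
    return Fraction(comb(m, 2) * 2, m) * gaussian_moment(m - 2)


step_ok = all(induction_step(m) == gaussian_moment(m) for m in range(2, 21, 2))
print(step_ok, [int(gaussian_moment(m)) for m in range(0, 11, 2)])
```

The identity reduces to $g(m) = (m-1)\,g(m-2)$, the usual double-factorial recursion for Gaussian moments.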
6 The random variables $Z_n$



We briefly consider the random variables defined recursively in (1.10). The major interest is 
in understanding the robustness of the asymptotic normality when changing the underlying 
probability distribution from binomial to uniform. 

Theorem 6.1. The mean value of Z n satisfies 

E(Z n ) = Cn-WiH* (l + + + O (n-^)) , (6.1) 

\ 16y/n 1536n v / 

where

$$ C := \frac{1}{2}\sqrt{\frac{e}{\pi}}\int_1^\infty\left(1 - \frac{1}{v}\right)e^{-v}\,dv \approx 0.06906\,46192\dots $$

The limit law of the normalized random variables $Z_n/\mathbb{E}(Z_n)$ is not normal:

$$ \frac{Z_n}{\mathbb{E}(Z_n)} \xrightarrow{d} Z, $$

where the distribution of $Z$ is uniquely characterized by its moment sequence and the GF $\zeta(y) := \sum_{m\ge 1}\mathbb{E}(Z^m)\,y^m/(m\cdot m!)$ satisfies the nonlinear differential equation

$$ y^2\zeta'' + y\zeta' - \zeta = y\zeta\zeta', \tag{6.2} $$

with $\zeta(0) = 0$ and $\zeta'(0) = 1$.

Proof. (Sketch) The proof of the theorem is simpler and we sketch only the major steps.

Mean value. First, $\nu_n := \mathbb{E}(Z_n)$ satisfies the recurrence

$$ \nu_n = \nu_{n-1} + \frac{1}{n}\sum_{0\le j<n}\nu_j\qquad(n\ge 2), $$

with $\nu_0 = 0$ and $\nu_1 = 1$. The GF $f(z)$ of $\mathbb{E}(Z_n)$ satisfies the differential equation

$$ f'(z) = \frac{2-z}{(1-z)^2}\,f(z) + \frac{1}{1-z}, $$

with the initial condition $f(0) = 0$. Surprisingly, this same equation (and the same sequence $\{\nu_nn!\}_n$, which is A005189 in the Encyclopedia of Integer Sequences) occurs in the study of two-sided generalized Fibonacci sequences; see Fishburn et al. (1988, 1989). The first-order differential equation is easily solved and we obtain the closed-form expression

$$ f(z) = \frac{z}{1-z} + \frac{e^{1/(1-z)}}{1-z}\int_1^{1/(1-z)}\left(1 - \frac{1}{v}\right)e^{-v}\,dv. $$

From this, the asymptotic approximation (6.1) results from a direct application of the saddle- 
point method (see Flajolet and Sedgewick's book (Flajolet and Sedgewick, 2009, Ch. VIII)); 
see also Fishburn et al. (1989). 
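To get a feeling for how quickly the leading term of (6.1) takes over, the mean values can be generated directly. The sketch below assumes the recurrence $\nu_n = \nu_{n-1} + \frac{1}{n}\sum_{0\le j<n}\nu_j$ for $n \ge 2$ with $\nu_0 = 0$, $\nu_1 = 1$, and compares $\nu_n$ with $C\,n^{-1/4}e^{2\sqrt{n}}$ using the numerical value of $C$ quoted in the theorem; the correction terms of (6.1) are deliberately left out, so the ratios are only expected to approach 1 slowly. Function names and the sample indices are illustrative.

```python
from fractions import Fraction
from math import exp, factorial, sqrt

C = 0.0690646192  # numerical value of the constant C quoted in Theorem 6.1


def nu_sequence(n_max):
    """Return [nu_0, ..., nu_{n_max}] from the uniform-split recurrence (assumed form)."""
    nu = [Fraction(0), Fraction(1)]
    partial = nu[0] + nu[1]            # running sum nu_0 + ... + nu_{n-1}
    for n in range(2, n_max + 1):
        nu.append(nu[n - 1] + partial / n)
        partial += nu[n]
    return nu


def leading_term(n):
    """Leading term C * n^(-1/4) * exp(2*sqrt(n)), without the correction factors."""
    return C * n ** -0.25 * exp(2 * sqrt(n))


nu = nu_sequence(400)
scaled = [nu[n] * factorial(n) for n in range(1, 6)]      # the integers nu_n * n!
ratios = {n: float(nu[n]) / leading_term(n) for n in (25, 100, 400)}
print([int(v) for v in scaled])
print(ratios)
```

Keeping a running partial sum makes the computation linear in the number of terms, apart from the cost of exact rational arithmetic.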
Asymptotic transfer. For higher moments and the limit law, we are led to consider the following recurrence.

$$ a_n = a_{n-1} + \frac{1}{n}\sum_{0\le j<n}a_j + b_n\qquad(n\ge 2), \tag{6.3} $$

with $a_0$ and $a_1$ given. For simplicity, we assume $a_0 = b_0 = 0$.

Proposition 6.2. Assume $a_n$ satisfies (6.3). If $b_n \sim cn^\beta\nu_n^\alpha$, where $\alpha > 1$ and $\beta \in \mathbb{R}$, then

$$ a_n \sim \frac{c\,\alpha}{\alpha^2-1}\,n^{\beta+1/2}\nu_n^\alpha. \tag{6.4} $$

The proof is similar to that for Proposition 4.2 and is omitted.



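Recurrence and induction. By Proposition 6.2 and the following recurrence relation for the moment GF $Q_n(y) := \mathbb{E}(e^{Z_ny})$

$$ Q_n(y) = \frac{Q_{n-1}(y)}{n}\sum_{0\le j<n}Q_j(y)\qquad(n\ge 2), $$

with $Q_0(y) = 1$ and $Q_1(y) = e^y$, we deduce, by induction using (6.4), that

$$ \mathbb{E}(Z_n^m) \sim \zeta_m\nu_n^m\qquad(m\ge 1), $$

where

$$ \zeta_m = \frac{m}{m^2-1}\sum_{1\le k<m}\binom{m}{k}\frac{\zeta_k\zeta_{m-k}}{m-k}\qquad(m\ge 2), \tag{6.5} $$

with $\zeta_0 = \zeta_1 = 1$. It follows that the function $\zeta(y) := \sum_{m\ge 1}\zeta_my^m/(m\cdot m!)$ satisfies the differential equation (6.2).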



Unique determination of the distribution. First, by a simple induction we can show, by (6.5), that $\zeta_m \le c\,m!\,K^m$ for a sufficiently large $K > 0$. This is enough for justifying the unique determination. Instead of giving the details, it is more interesting to note that the nonlinear differential equation (6.2) represents another typical case for which the asymptotic behavior of its coefficients ($\mathbb{E}(Z^m)$ for large $m$) necessitates the use of the psi-series method recently developed in Chern et al. (2012). We can show, by the approach used there, that

E(Z m ) = m ■ m\p- m (2 + -^— 2 + O ( m - 3 )^ , 

where $\rho > 0$ is an effectively computable constant. Note that there is no term of the form $m^{-1}$ in the expansion, a typical situation when the psi-series method applies; see Chern et al. (2012). □
Concluding remarks

The approach we used in this paper is of some generality and is amenable to other quantities. We conclude this paper with a few examples and a list of some concrete applications where the scale $n^{c\log n}$ also appears.

First, the expected number of independent sets in a random graph (under the $\mathscr{G}_{n,p}$ model), as given in (1.8), satisfies the recurrence ($\bar J_n := J_n + 1$)

$$ \bar J_n = \bar J_{n-1} + \sum_{0\le k<n}\binom{n-1}{k}p^{n-1-k}q^k\,\bar J_k\qquad(n\ge 1), $$

with $\bar J_0 = 1$. Thus the Poisson GF $f(z) := e^{-z}\sum_{n\ge 0}\bar J_nz^n/n!$ satisfies the equation

$$ f'(z) = f(qz), $$

with $f(0) = 1$. The modified Laplace transform then satisfies the functional equation

$$ f^*(s) = 1 + sf^*(qs), $$

which, by iteration, leads to the closed-form expression

$$ f^*(s) = \sum_{j\ge 0}q^{j(j-1)/2}s^j. $$

Thus all analysis as in Section 2 applies with $F$ and $G$ there replaced by

$$ F(s) := \sum_{j\in\mathbb{Z}}q^{j(j-1)/2}s^j,\qquad G(u) := q^{(\{u\}^2+\{u\})/2}\,F\left(q^{-\{u\}}\right). $$



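We obtain for example

$$ J_n = \frac{G\left(\log_\kappa\frac{n}{\log_\kappa n}\right)}{\sqrt{2\pi}}\cdot\frac{n^{1/\log\kappa+1/2}}{\log_\kappa n}\,\exp\left(\frac{\left(\log\frac{n}{\log_\kappa n}\right)^2}{2\log\kappa}\right)\left(1 + O\left(\frac{(\log\log n)^2}{\log n}\right)\right). $$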


The same approach also applies to the pantograph equation

$$ \Phi'(z) = a\Phi(qz) + \Psi(z)\qquad(a > 0), $$

with $\Phi(0)$ and $\Psi(z)$ given, for $\Psi(z)$ satisfying properties that can be easily imposed.

Other extensions will be discussed elsewhere. We conclude with some other algorithmic, combinatorial and analytic contexts where $n^{c\log n}$ appears.

- Algorithmics: isomorphism testing (see Babai and Qiao (2012); Grosek and Sys (2010); Huber (2011); Miller (1978); Rosenbaum (2012)), autocorrelations of strings (see Guibas and Odlyzko (1981); Rivals and Rahmann (2003)), information theory (see Abu-Mostafa (1986)), random digital search trees (see Drmota (2009)), population recovery (see Wigderson and Yehudayoff (2012)), and asymptotics of recurrences (see Knuth (1966); O'Shea (2004));

- Combinatorics: partitions into powers (see de Bruijn (1948); Mahler (1940); see also Fredman and Knuth (1974) for a brief historical account and more references), palindromic compositions (see Ji and Wilf (2008)), combinatorial number theory (see Cameron and Erdős (1990); Lev et al. (2001)), and universal tree of minimum complexity (see Chung et al. (1981); Gol'dberg and Livsic (1968));

- Probability: log-normal distribution (see Johnson et al. (1994)), renewal theory (see van Beek and Braat (1973); Vardi et al. (1981)), and total positivity (see Karlin and Ziegler (1996));

- Algebra: commutative ring theory (see Campbell et al. (1999)), and semigroups (see Kuzmin (1993); Reznykov and Sushchansky (2006); Shneerson (2001));

- Analysis: pantograph equations (see Iserles (1993); Kato and McLeod (1971)), eigenfunctions of operators (see Spiridonov (1995)), geometric partial differential equations (see De Marchis (2010)), and $q$-difference equations (see Adams (1931); Carmichael (1912); Di Vizio et al. (2003); Ramis (1992); Zhang (1999, 2012)).

This list is not meant to be complete but to show to some extent the generality of the seemingly uncommon scale $n^{c\log n}$; it also suggests possibly nontrivial connections between instances in various areas, whose clarification in turn may lead to further development of more useful tools such as those in this paper.